DISCRIMINATION OF PATHOGENIC VERSUS
NON-PATHOGENIC YERSINIA PESTIS AND
ESCHERICHIA COLI USING PROTEOMICS MASS SPECTROMETRY


1.  INTRODUCTION 

Recently, mass spectrometry (MS) analysis has proven useful in the
characterization and identification of biological agents using a proteomic approach (1).

Therefore, the present study sought to determine whether proteomics MS could be used to
distinguish between pathogenic and non-pathogenic strains of the same organism. More
specifically, discrimination between pathogenic and non-pathogenic organisms based on their
outer membrane protein (OMP) composition, as determined by MS, was investigated.

OMPs of gram-negative bacteria act as active mediators between the cell and its
environment and are often associated with virulence in gram-negative pathogens. Pathogenic
Escherichia coli carries multiple OMPs that are needed for intestinal colonization,
as well as OMPs that play a role in the type III secretion system responsible for delivering
effector proteins to host cells (2, 3, 4, 5, 6). Virulent Yersinia pestis contains three plasmids
encoding multiple OMPs that are required for virulence (7, 8, 9). For example, the pCD1 plasmid
encodes several Yersinia outer membrane proteins (YOPs) and a type III secretion system, which are
needed for survival and entry into eukaryotic cells (10, 11). Additionally, the pPCP1 plasmid
encodes an OMP plasminogen activator that interferes with clotting and complement (12).
Avirulent strains often lack one or more of the plasmids or genes encoding proteins needed for
virulence, and it is these differences in OMP expression between virulent and avirulent strains of
gram-negative bacteria that could potentially be exploited to distinguish among strains.
Therefore, OMPs could prove to be excellent model biomarkers for strain differentiation among
bacteria.


The objective of the present study was to establish the sequence-based identity of
OMPs isolated from pathogenic and non-pathogenic strains of Y. pestis and E. coli. Y. pestis is
classified as a Category A pathogen and is an important potential biological warfare agent.
Pathogenic E. coli, such as E. coli O157:H7, is an important public health pathogen responsible
for most common foodborne and waterborne illnesses in the United States. High-throughput
proteomic analytical systems were applied, providing a rapid means of characterizing cellular
proteins and producing amino acid sequence information for peptides derived from these
proteins.


This one-year basic research study aimed to: 1) isolate OMPs using
ultracentrifugation and differential extractions; 2) determine the sequence of, and
post-translational modifications to, the amino acid residues composing membrane proteins using
emerging high-throughput mass spectral proteomic systems; and 3) use bioinformatics modeling
tools to establish strain differentiation methods based on the proteome differences among the
Y. pestis and E. coli strains.

In addition to the aims described above, discrimination among strains of
an additional agent of interest, Bacillus anthracis, was also investigated. Because
B. anthracis is a gram-positive organism and therefore lacks an outer membrane, total
cellular proteins (whole cell lysates) rather than OMPs were analyzed for discrimination
via mass spectrometry.


2.  MATERIALS  AND  METHODS 

2.1  Materials  and  Reagents. 

Ammonium bicarbonate, dithiothreitol, urea, acetonitrile (HPLC grade), and formic
acid were purchased from Burdick and Jackson (St. Louis, MO). Sequencing grade modified
trypsin was purchased from Promega (Madison, WI).

2.2  Bacterial  Strains  and  Culture  Conditions. 

Pathogenic strains used in the present study were E. coli O157:H7, Y. pestis
Colorado 92 (CO92), and B. anthracis Ames. Non-pathogenic strains used were E. coli K12,
Y. pestis A1122, and B. anthracis Sterne. Working cultures were prepared by streaking cells from
cryo-preserved stocks onto tryptic soy agar (TSA), followed by incubation for approximately
18 h at 37 °C for E. coli and B. anthracis strains and 30 °C for Y. pestis strains. After incubation,
all working culture plates were stored at 4 °C. Cells from working cultures were used to
inoculate broth cultures for each strain, which consisted of 100 mL of tryptic soy broth (TSB) for
E. coli and B. anthracis strains and 100 mL of brain heart infusion (BHI) for Y. pestis strains. All
cultures were incubated for approximately 18 h at 37 °C for E. coli and B. anthracis strains and
30 °C for Y. pestis strains with rotary aeration at 180 rpm. After incubation, broth cultures were
pelleted by centrifugation (2,300 RCF at 4 °C for 10 min), washed, and resuspended in 10 mL
HEPES buffer, followed by heating at 95 °C for 1 h to lyse cells. After heating, a portion of each
sample was plated onto TSA and incubated for 5 days at the appropriate temperature to ensure no
growth before samples were removed from the BSL-2 or BSL-3 laboratory for further processing.
Total cellular protein samples (whole cell lysates) were complete after the 1 h heating step and
were transferred to the Point Detection Branch for analysis once no growth on plates was confirmed.
OMP samples were processed for OMP isolation as described below before being
transferred to the Point Detection Branch for analysis.

2.3  OMP  Isolation. 

After lysis by heating at 95 °C for 1 h, cell debris was pelleted by centrifugation at
2,300 RCF at 4 °C for 10 min. The supernatant was then centrifuged at 100,000 × g for 1 h to
pellet proteins. The pellet was resuspended in 1 mL of HEPES buffer, 1 mL of a 2% sarkosyl
solution (N-lauroylsarcosine sodium salt solution) was added, and the sample was incubated at room
temperature for 30 min. Next, samples were centrifuged at 100,000 × g for 1 h, and the pellet
containing OMPs was resuspended in 1 mL of HEPES buffer and then transferred to the Point
Detection Branch for further processing and analysis as described below.

2.4  Processing of Whole Cell Lysates and OMP Samples.


All protein samples were ultrasonicated (20 s pulse on, 5 s pulse off, 25%
amplitude, 5 min duration), and a small portion of each lysate was reserved for 1-D gel analysis.
The lysates were centrifuged at 14,100 × g for 30 min to remove any debris. The supernatant
was then added to a Microcon YM-3 filter unit (Millipore; Cat. # 42404) and centrifuged at
14,100 × g for 30 min. The effluent was discarded. The filter membrane was washed with
100 mM ammonium bicarbonate (ABC) and centrifuged for 15-20 min at 14,100 × g. Proteins
were denatured by adding 8 M urea and 3 µg/µL dithiothreitol (DTT) to the filter and incubating
overnight at 37 °C on an orbital shaker set to 60 rpm. Twenty microliters of 100% acetonitrile
(ACN) was added to the tubes and allowed to incubate at room temperature for 5 min. The tubes
were then centrifuged at 14,100 × g for 30-40 min and washed three times using 150 µL of
100 mM ABC solution. On the last wash, ABC was allowed to sit on the membrane for 20 min
while shaking, followed by centrifugation at 14,100 × g for 30-40 min. The Microcon filter unit
was then transferred to a new receptor tube, and proteins were digested with 5 µL trypsin in
240 µL of ABC solution + 5 µL ACN. Proteins were digested overnight at 37 °C on an orbital
shaker set to 55 rpm. Sixty microliters of 5% ACN/0.5% formic acid (FA) was added to each
filter to quench the trypsin digestion, followed by 2 min of vortexing for sample mixing. The
tubes were centrifuged for 20-30 min at 14,100 × g. An additional 60 µL of the 5% ACN/0.5% FA
mixture was added to the filter and centrifuged. The effluent was then analyzed using the
LC-MS/MS technique.

2.5  Protein  Database  and  Database  Search  Engine. 

A protein database was constructed in FASTA format using the annotated
bacterial proteome sequences derived from the fully sequenced chromosomes of 881 bacteria,
including their sequenced plasmids (as of April 2009). A PERL program
(http://www.activestate.com/Products/ActivePerl; accessed April 2009) was written to
automatically download these sequences from the National Institutes of Health National Center
for Biotechnology Information (NCBI) site (http://www.ncbi.nlm.nih.gov; accessed April 2009).
Each database protein sequence was supplemented with information about its source organism and
the genomic position of the respective ORF, embedded into a header line. The database of bacterial
proteomes was constructed by translating putative protein-coding genes and consists of tens of
millions of amino acid sequences of potential tryptic peptides obtained by the in silico digestion
of all proteins (assuming up to two missed cleavages).
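
The in silico digestion step can be illustrated with a minimal Python sketch. The cleavage rule and the code actually used are not given in this report; the sketch assumes the conventional trypsin rule (cleavage C-terminal to K or R, except when the next residue is P). The function name tryptic_peptides, the min_length parameter, and the example sequence are hypothetical.

```python
import re


def tryptic_peptides(protein, max_missed=2, min_length=1):
    """Return the set of peptides from an in silico tryptic digest.

    Assumed rule (not stated in the report): cleave after K or R
    unless the next residue is P.
    """
    # Split after every K/R that is not followed by P; drop empty pieces.
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", protein) if f]
    peptides = set()
    for start in range(len(fragments)):
        # Join 1..max_missed+1 consecutive fragments,
        # i.e. allow 0..max_missed missed cleavages.
        last = min(start + max_missed + 1, len(fragments))
        for end in range(start + 1, last + 1):
            peptide = "".join(fragments[start:end])
            if len(peptide) >= min_length:
                peptides.add(peptide)
    return peptides


# Hypothetical sequence; the R at position 6 is followed by P, so it is not cleaved.
for peptide in sorted(tryptic_peptides("AAKGGRPLKTTRCC")):
    print(peptide)
```

Applying such a function to every protein in the FASTA file, and recording the source organism for each peptide, would give the peptide-level database described above.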

The experimental MS/MS spectral data of bacterial peptides were searched
using the SEQUEST algorithm against the constructed proteome database of microorganisms. The
SEQUEST thresholds for searching the product ion mass spectra of peptides were Xcorr,
deltaCn, Sp, RSp, and deltaMpep. These parameters provided a uniform matching score for all
candidate peptides. The generated out files of these candidate peptides were then validated using
the PeptideProphet algorithm. Peptide sequences with a probability score of 95% or higher were
retained in the dataset and used to generate a binary matrix of sequence-to-bacterium
assignments. The binary matrix was populated by matching the peptides with
corresponding proteins in the database and assigning a score of 1 for a match and a score of 0
for a non-match. Each column in the binary matrix represents the proteome of a given bacterium,
and each row represents a tryptic peptide sequence from the LC-MS/MS analysis.
Sample microorganisms were matched with database bacteria based on the number of unique
peptides that remained after further filtering of degenerate peptides from the binary matrix.
Verification of the classification and identification of candidate microorganisms was performed
through hierarchical clustering analysis and taxonomic classification.
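
The construction of the sequence-to-bacterium binary matrix can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: a peptide is treated as "unique" when it matches exactly one database organism and as "degenerate" when it matches more than one, and all peptide strings, organism names, and probabilities are hypothetical.

```python
def build_stb_matrix(peptide_probs, proteomes, min_prob=0.95):
    """Build a binary matrix: rows are accepted peptides, columns are bacteria.

    peptide_probs: peptide sequence -> validation probability (0..1).
    proteomes:     organism name -> set of in silico tryptic peptides.
    """
    accepted = sorted(p for p, prob in peptide_probs.items() if prob >= min_prob)
    organisms = sorted(proteomes)
    matrix = [[1 if pep in proteomes[org] else 0 for org in organisms]
              for pep in accepted]
    return accepted, organisms, matrix


def unique_peptide_counts(organisms, matrix):
    """Count peptides that match a single organism (assumed meaning of 'unique')."""
    counts = {org: 0 for org in organisms}
    for row in matrix:
        if sum(row) == 1:  # rows matching several organisms are degenerate
            counts[organisms[row.index(1)]] += 1
    return counts


# Hypothetical toy data.
proteomes = {"strain_A": {"PEPA", "SHARED"}, "strain_B": {"PEPB", "SHARED"}}
peptide_probs = {"PEPA": 0.99, "PEPB": 0.97, "SHARED": 0.99, "LOWQ": 0.50}

accepted, organisms, matrix = build_stb_matrix(peptide_probs, proteomes)
print(accepted, organisms, matrix)
print(unique_peptide_counts(organisms, matrix))
```

In this toy case the low-probability peptide is rejected before the matrix is built, and the shared peptide contributes to neither organism's unique count.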

In-house software called “BACid” transformed the results of searching MS/MS spectra of
peptide ions with the commercial software SEQUEST against a custom protein database
(downloaded from NCBI) into a taxonomically meaningful
and easy-to-interpret output. It calculated the probability that a peptide sequence assignment to
an MS/MS spectrum was correct and used accepted spectrum-to-sequence matches to generate a
sequence-to-bacterium (STB) binary matrix of assignments. Validated peptide sequences,
differentially present or absent in various strains (STB matrices), were visualized as assignment
bitmaps and analyzed by a BACid module that used phylogenetic relationships among bacterial
species as part of a decision tree process. The bacterial classification and identification algorithm
used assignments of organisms to taxonomic groups (phylogenetic classification) based on an
organized scheme that begins at the phylum level and follows through classes, orders, families,
and genera down to the strain level. BACid was developed in-house using PERL, MATLAB, and
Microsoft Visual Basic.
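
The top-down taxonomic scheme can be pictured with the following Python sketch. It is not the BACid implementation, whose decision rules are not described in this report. It assumes that, at each rank, the taxon with the largest summed peptide support is kept and that only organisms within that taxon are carried to the next rank. The function name, the lineages, and the peptide counts are illustrative.

```python
RANKS = ("phylum", "class", "order", "family", "genus", "species", "strain")


def classify_top_down(lineages, support):
    """Walk the ranks from phylum to strain, keeping the best-supported taxon.

    lineages: organism name -> tuple of taxon names, one per entry in RANKS.
    support:  organism name -> number of peptides assigned to that organism.
    """
    candidates = set(lineages)
    path = []
    for depth, rank in enumerate(RANKS):
        totals = {}
        for org in candidates:
            taxon = lineages[org][depth]
            totals[taxon] = totals.get(taxon, 0) + support.get(org, 0)
        # sorted() makes the choice deterministic when totals are tied.
        best = max(sorted(totals), key=totals.get)
        path.append((rank, best))
        candidates = {org for org in candidates if lineages[org][depth] == best}
    return path


# Illustrative lineages and hypothetical peptide counts.
ENTERO = ("Proteobacteria", "Gammaproteobacteria", "Enterobacteriales",
          "Enterobacteriaceae")
lineages = {
    "Y. pestis CO92": ENTERO + ("Yersinia", "Yersinia pestis", "CO92"),
    "Y. pestis 91001": ENTERO + ("Yersinia", "Yersinia pestis", "91001"),
    "E. coli K12": ENTERO + ("Escherichia", "Escherichia coli", "K12"),
    "B. anthracis Ames": ("Firmicutes", "Bacilli", "Bacillales", "Bacillaceae",
                          "Bacillus", "Bacillus anthracis", "Ames"),
}
support = {"Y. pestis CO92": 40, "Y. pestis 91001": 25,
           "E. coli K12": 10, "B. anthracis Ames": 2}

for rank, taxon in classify_top_down(lineages, support):
    print(rank, taxon)
```

A scheme of this kind reports a confident genus or species call even when the strain-level call is ambiguous.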


3.  RESULTS  AND  DISCUSSION 

The current project characterized and identified pathogenic and non-pathogenic
strains of the same organism based on proteins present in whole cell lysates (global) versus
OMP preparations (specific). All results are shown and discussed below. B. anthracis Ames and
Sterne strains were also included to expand the project; however, only whole cell lysates
were analyzed for these strains, and the results are discussed below.

Figure 1 below serves as an example to illustrate the typical output generated for
the LC-ESI MS/MS analyses of bacterial protein digests using bioinformatics tools to process the
peptide sequence information for bacterial differentiation and classification. The top window
lists the identified unique proteins and their corresponding bacterium match. The lower window
represents the binary matrix of the sequence-to-bacterium search matching. The total row in the
lower window represents the total number of unique proteins identified for a given bacterium.
Figure 2 also serves as an example and shows the histogram generated by plotting the number of
unique proteins versus the bacterium matching in the database. The y-axis represents the
percentage of unique peptides matched with a 95% confidence level for all the bacteria on the
x-axis. In this example case, the identified bacterium at the strain level is Y. pestis. The
horizontal red line is the threshold cutoff under which common degenerate peptides among various
bacteria within the constructed proteome database are shown. These degenerate peptides are
removed from the total number of unique peptides of the identified species.

Figure 1. MS-based Proteomic Approach Output. The upper section represents the matching
algorithm results for the identified tryptic peptides resulting from the LC-MS/MS analysis. The
lower section represents the binary matrix of sequence-to-bacterium scoring. The presence of a
unique peptide corresponding to a protein in the given proteome of a bacterium is scored 1; a
non-match is scored 0.


Figure 2. Histogram Representing the Output of the Binary Matrix of the Unique Peptides
Identified for a Given Bacterium at 95% Confidence Level. The horizontal line is the threshold
under which peptides identified are considered statistically non-significant.
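
The thresholding shown in Figure 2 can be sketched in Python as follows. How the cutoff was derived is not stated in this report, so the sketch makes two assumptions: each organism's percentage is taken relative to all peptides counted across organisms, and the cutoff is a fixed, hypothetical value. The function name and the counts are also hypothetical.

```python
def above_threshold(counts, cutoff_percent):
    """Return organisms whose share of matched peptides exceeds the cutoff.

    counts:         organism name -> number of matched peptides.
    cutoff_percent: hypothetical threshold, in percent.
    """
    total = sum(counts.values())
    if total == 0:
        return {}
    percents = {org: 100.0 * n / total for org, n in counts.items()}
    return {org: pct for org, pct in percents.items() if pct > cutoff_percent}


# Hypothetical counts: one dominant organism plus low-level background matches.
counts = {"organism_A": 45, "organism_B": 3, "organism_C": 2}
print(above_threshold(counts, cutoff_percent=5.0))
```

Organisms that fall under the line are treated as background arising from degenerate peptides rather than as candidate identifications.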


3.1  Differentiation of Pathogenic vs. Non-pathogenic E. coli Strains Using Whole Cell Lysates.


Whole cell lysates of pathogenic and non-pathogenic E. coli strains, E. coli
O157:H7 and E. coli K12, respectively, were prepared and analyzed by proteomic mass
spectrometry as described above. Results showed correct identification at the strain level for
both samples analyzed. The near-neighbor analysis, using a Euclidean distance linkage approach,
showed that the unique set of proteins identified for each lysed bacterial sample had the closest
match with the E. coli strain employed (Figures 3a-b).

Figure 3a shows correct identification of one sample as E. coli O157:H7, with
the next near neighbor being E. coli UTI89, a causative agent of human urinary tract infections.
Although E. coli UTI89 is closely related to E. coli O157:H7, it is missing certain proteins,
such as the BAA35715 outer membrane protein and flagella-related proteins, that are distinctly
expressed in E. coli O157:H7 but not in E. coli UTI89. Moreover, the analyzed sample of the
non-pathogenic E. coli K12, shown in Figure 3b, was correctly identified as E. coli K12, yet had
equal similarity with E. coli W3110, which is a non-pathogenic strain of E. coli genetically very
closely related to E. coli K12.
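
The distance calculation behind the near-neighbor analysis described above can be sketched in Python. The sketch only ranks reference profiles by Euclidean distance to a sample profile; it does not reproduce the full linkage (clustering) procedure, whose settings are not given in this report. The binary profiles and strain labels are hypothetical, with two references made identical to show how a tie such as the K12/W3110 result can arise.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length numeric profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_neighbors(sample, references):
    """Return (distance, name) pairs sorted from closest to farthest."""
    return sorted((euclidean(sample, profile), name)
                  for name, profile in references.items())


# Hypothetical presence/absence profiles over five proteins.
sample = [1, 1, 1, 0, 1]
references = {
    "strain_A": [1, 1, 1, 0, 1],
    "strain_B": [1, 1, 1, 0, 1],  # identical to strain_A: a tie
    "strain_C": [1, 0, 1, 1, 1],
}

for distance, name in nearest_neighbors(sample, references):
    print(f"{name}: {distance:.3f}")
```

When two references share the minimum distance, the detected proteins alone cannot separate them, and additional strain-specific markers are needed.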


Figure 3a-b. Near-neighbor Classification of Pathogenic E. coli O157:H7 (Figure 3a) vs.
Non-pathogenic E. coli K12 (Figure 3b) Using Whole Cell Lysates.

The number of unique proteins identified differed between the pathogenic and
non-pathogenic E. coli strains, with the pathogenic strain having a lower number of
unique proteins (114) than the non-pathogenic E. coli (139). Among the unique proteins of
E. coli K12, most were shared with E. coli O157:H7; however, the few proteins that
were not shared could potentially be used for strain-level discrimination. The difference in the
number of unique proteins could be explained by the fact that E. coli K12 has been studied more
extensively than any other E. coli strain and therefore has more genetic and biochemical
information available to serve as a foundation for interpreting proteome sequences from other
strains (13). This difference in the number of unique proteins between the two strains
probably contributed to the difference in the similarity scoring for each respective strain, as
shown in Table 1.

3.2  Differentiation of Pathogenic vs. Non-pathogenic E. coli Strains Using OMPs.

The results of using OMPs as biomarkers for bacterial differentiation of
pathogenic versus non-pathogenic E. coli strains are shown in Figures 4a-b. Each E. coli strain
was correctly identified, with no near-neighbor strains sharing the strain-level identification.
Figure 4a shows the resultant near-neighbor similarity linkage analysis for OMP extracts from
E. coli K12. The OMP extract yielded a unique set of protein biomarkers that is capable of
enhancing differentiation at the strain level and resulted in complete similarity with the E. coli
K12 strain. No ambiguity was observed in the identification, unlike that experienced when using
the whole cell lysates, in which an equal classification was shared between the E. coli K12 and
E. coli W3110 strains (Figure 3b). Although the E. coli K12 and E. coli W3110 strains are
genetically indistinguishable and their protein content appears very similar when analyzing
whole lysates, a distinct difference (significant dissimilarity) was observed between the two
closely related strains when using OMP extracts.

Figure 4b shows the near-neighbor similarity linkage results for the OMP extract
of E. coli O157:H7. Better discrimination of E. coli O157:H7 was achieved using OMP extracts
than using whole cell lysates. The number of unique OMPs that could be identified
was greater in the OMP extracts than in the whole cell lysates. However,
this does not imply the absence of these OMPs from the whole cell lysate. Rather, it is likely that
a higher abundance of non-OMPs in the whole cell lysate suppresses the detection of the
OMPs in the whole lysate extracts. MS analysis has been reported to suffer ionization
suppression due to the presence of large numbers of ionizable species. Generally, the whole cell
lysate has a larger number of ionizable peptides and a greater abundance of non-outer membrane
tryptic peptides than OMP extracts and is therefore highly likely to experience ionization
suppression during MS analysis.

Table 1. Unique Proteins in E. coli O157:H7 vs. E. coli K12 (Potential Biomarkers)

Protein Accession #  Protein Info                                                   E. coli K12 sub. HB101  E. coli O157:H7
-------------------  -------------------------------------------------------------  ----------------------  ---------------
NP_418358.1          stress-induced protein                                         X
NP_417795.1          bacterioferritin, iron storage and detoxification protein      X
YP_671573.1          putative cytoplasmic protein                                   X
NP_415386.1          lipoprotein                                                    X
NP_755058.1          GnsA/GnsB family protein                                       X
NP_668903.1          Chorismate synthase                                            X
YP_670276.1          Hypothetical protein                                           X
YP_669714.1          Aspartyl-tRNA synthetase                                       X
NPJ 12864            two-component sensor protein related to pathogenicity islands                          X
NP_310689.1          Structural flagella protein                                                            X
NP_290256            Secreted protein EspA, related to pathogenicity islands                                X
BAA35715             Outer membrane protein                                                                 X
NP_286049            putative beta-barrel outer membrane protein

Comparison of the whole cell and OMP extracts from E. coli O157:H7 showed
distinct differences in the nature of the identified unique protein biomarkers. The whole cell
lysate for E. coli O157:H7 had unique proteins that it shared with its genetically closest strain,
E. coli UTI89. However, this was not the case when comparing the unique proteins for these two
strains using the OMP extracts. For the OMP extracts, the difference in the number of
strain-unique protein biomarkers increased compared with that of the whole lysate analysis.

Figure 4a-b. Near-neighbor Classification of Non-pathogenic E. coli K12 (Figure 4a) vs.
Pathogenic E. coli O157:H7 (Figure 4b) Using OMP Extracts.


3.3  Differentiation of Pathogenic vs. Non-pathogenic Y. pestis Strains Using Whole Cell Lysates and OMPs.

Comparison of proteins present in whole cell lysates and OMP extracts of
pathogenic versus non-pathogenic Y. pestis strains, Y. pestis CO92 and Y. pestis A1122,
respectively, was performed. Figure 5a-b shows the near-neighbor analysis, using a Euclidean
distance linkage approach, for the bacterial identification based on OMP extracts. The identified
unique sets of proteins had the closest match with the employed Y. pestis strains. It should be
noted that the Y. pestis A1122 strain is not included in the current database because its genome
has not been fully sequenced and made publicly available. The constructed proteome database
does, however, include all the Y. pestis strains that are listed as pathogenic in the DoD
classification. Figure 5a shows the dendrogram for the identification of the avirulent Y. pestis
A1122 sample. The result showed that this strain was identified at the strain level as Y. pestis
91001, which is also an avirulent strain. This finding is encouraging because Y. pestis 91001 is
the only avirulent strain in the proteome database among the seven pathogenic Y. pestis strains
currently included. Therefore, the absence of Y. pestis strain A1122 from the database provided
an indirect test of the robustness of the proteomic approach in the classification of non-database
bacteria. This also provides additional confidence in our findings, in which identification at the
species level was correct (Figure 5a-b). Based on these results, the unique sets of proteins for
Y. pestis A1122 most closely resemble those found in the identified avirulent Y. pestis 91001
strain.

Figure 5b shows the identification result for the OMP extracts from both Y. pestis
strains. This figure indicates a correct strain-level identification of the studied samples. A closer
look at the set of unique protein biomarkers for virulent Y. pestis shows the presence of
biomarkers associated with virulence factors. For example, proteins encoded by the Y. pestis
virulence plasmids were present: pPCP1, which encodes plasminogen activator; pCD1, which
encodes the low-calcium response; and pMT1, which encodes murine toxin and the structural gene
for the fraction 1 protein capsule. The latter protein was present in higher abundance than
the other mentioned protein biomarkers. Y. pestis A1122 lacks the pCD1 plasmid and
therefore did not express the corresponding OMPs encoded by the plasmid.

Comparing the number of unique proteins for the employed Y. pestis strains
showed a difference between Y. pestis CO92 and Y. pestis A1122. The former strain had 191
unique proteins versus 89 for the latter. Upon removing the highly conserved, housekeeping, and
energy transfer proteins from both sets, the number of strain-unique proteins for Y. pestis CO92
was higher than that for Y. pestis A1122. The protein biomarkers that were observed for virulent
Y. pestis versus Y. pestis A1122 were present upon replicate analyses of the OMP extracts under
different sample preparation conditions and instrumental analysis parameters.

Table 2 shows the comparison of the strain-unique proteins for Y. pestis identified
from the different cellular extracts. Comparing the whole cell lysate with the OMP extract
showed a variation in the number of strain-unique protein biomarkers.
The number of strain-unique proteins was slightly higher for pathogenic Y. pestis from
OMP extracts than from the whole cell lysate. Few unique biomarkers were shared, and virulence
factors seemed to be present in higher abundance in the OMP extracts than in the whole cell
lysates, which is consistent with the reported literature (14).

Figure 5a-b. Near-neighbor Classification of Non-pathogenic Y. pestis A1122 (Figure 5a) vs.
Pathogenic Y. pestis (Figure 5b) Using OMP Extracts.

Table 2. Unique Proteins for Y. pestis from Whole Cell Lysate vs. OMP Extracts

Y. pestis CO92 Unique Proteins, Whole Cell Extract  Y. pestis CO92 Unique Proteins, OMP Extracts
--------------------------------------------------  --------------------------------------------
30S ribosomal protein S6                            50S ribosomal protein L5
50S ribosomal protein L32                           murine toxin
Acid-induced glycyl radical enzyme                  attachment invasion locus protein
cationic 19 kDa outer membrane protein p            elongation factor Ts
cationic 19 kDa outer membrane protein pr           elongation factor G
chorismate mutase                                   putative outer membrane porin A protein
DNA-binding protein HU-alpha                        hypothetical protein plu4065
ferritin                                            major outer membrane lipoprotein
hypothetical protein YP_0808                        putative outer membrane porin A protein
hypothetical protein YP_1194                        putative outer membrane porin A protein
hypothetical protein YP_1779                        Acid-induced glycyl radical enzyme
hypothetical protein YP_1779                        attachment invasion locus protein
hypothetical protein YP_pCD78                       attachment invasion locus protein
hypothetical protein y2159
hypothetical protein YP_3210
malate dehydrogenase
manganese superoxide dismutase

3.4  Differentiation of Pathogenic vs. Non-pathogenic B. anthracis Strains Using Whole Cell Lysates.

The pathogenic and non-pathogenic B. anthracis strains, Ames and Sterne, respectively, were
analyzed by proteomic mass spectrometry for identification. Figure 6a-b shows the histogram for
the sequence-to-bacterium binary matrix, with the number of unique peptides on the y-axis and
the bacterium proteome on the x-axis. As seen in this figure, correct identification of each strain
was made, but with a higher confidence level for B. anthracis Ames than for the B. anthracis
Sterne strain. This observation is likely attributable to the fact that B. anthracis Ames carries an
additional plasmid (pXO2) that is lacking in the B. anthracis Sterne strain; proteins encoded by
this plasmid are therefore not expressed and cannot serve as detectable biomarkers in the
MS-based proteomic analysis of B. anthracis Sterne. In Figure 6b,
the presence of B. anthracis Ames slightly above the threshold cutoff of the 95% confidence level
supports such an observation. This was not the case with the B. anthracis Ames samples, where a
distinct identification unadulterated by the presence of other B. anthracis strains was observed.
The strain-unique peptides observed with both B. anthracis strains are an indication of the
application spectrum of such an approach. Although B. anthracis strains do not possess OMPs,
using their whole cell lysate was sufficient to reveal the discrimination power of the MS-based
proteomic approach. It is evident that using higher concentrations and optimizing the lysis
protocols could enhance the MS-based proteomic analysis and make it useful for Bacillus
differentiation at the strain level.

Figure 6a-b. Bacterial Differentiation of B. anthracis Strains Using Whole Cell Lysates.
Figure 6a represents the identification of the B. anthracis Ames strain, while Figure 6b represents
the identification of the B. anthracis Sterne strain. The x-axis represents the bacterium proteome
and the y-axis represents the number of unique peptides at the 95% confidence level. The
horizontal line is the threshold under which peptides identified are considered statistically
non-significant.


Further work should be conducted to investigate the strain-unique peptides for
B. anthracis strains to understand their characterization in a biologically meaningful medium.
Doing so will increase our knowledge of the set of functional proteins responsible for strain
virulence. Determining the corresponding set of genes that can be biologically manipulated under
different environmental conditions will be of great interest for measuring the validity of strain
differentiation.


4.  CONCLUSIONS 

This project revealed the advantage of using OMPs as unique biomarkers for
bacterial differentiation of pathogenic versus non-pathogenic strains. Using OMPs as
biomarkers enhanced the confidence level of the discrimination process.
OMPs provide a unique source of cellular variability and thus introduce
diversity among cellular proteins for very similar bacterial strains, thereby providing distinct
and unique protein biomarkers. The whole cell lysates did provide discrimination; however,
possible ionization suppression could mask the detection of important peptides that could be
classified as unique biomarkers. On the other hand, whole cell lysates are an appropriate option
for the differentiation of gram-positive bacterial strains, and the results reported herein support
their potential application in differentiation. Overall, an extension of this project to include a
wider investigation of relevant pathogenic bacteria, such as Francisella tularensis, Burkholderia
spp., and other relevant strains, could provide a more global outlook on the importance of
OMPs with regard to pathogenicity and on how organisms can be confidently identified at the
strain level using protein biomarkers for bacterial classification and diagnostic purposes.